Model Calibration as Part of Model Evaluation in a Model Evolution System

Model calibration compares a (data) model revision against a benchmark (expert) model. It is a validation process that provides external quality assurance. The key question asked during calibration is: "Does the proposed new model meet stakeholder needs?"

The calibration process poses the following three questions, which determine the extent to which a proposed model is fit for purpose:

  1. Data integrity: How good is my data?
  2. Model integrity: How well does my new model match my existing model?
  3. Abstraction integrity: How well does my model represent reality?

Generate Data Model


In [1]:
import features.feature_ts as ts
import evaluation.calibration as ec
import evaluation.evalplot as ep
import benchmark.bm0 as bm0

import pandas as pd
import numpy as np

import plotly.offline as offline
import plotly.graph_objs as go
import plotly as py
offline.init_notebook_mode(connected=True) #set for plotly offline plotting


User Input

Set the year and experiment directory.


In [2]:
#Select data and model
year = 2011
experiment_dir = 'exp1_BNve_1' # sub-directory in which inferred customer classes are saved

Retrieve Experimental Model

On the first run, generate and save the inferred classes and the experimental model:

experiment.algorithms.BNve.saveClasses()
experiment.experimental_model.saveExpModel()

In [5]:
ods = pd.read_csv('data/experimental_model/exp1_BNve_1/demand_summary_2011.csv')
ohp = pd.read_csv('data/experimental_model/exp1_BNve_1/hourly_profiles_2011.csv')

ohp.head()


Out[5]:
class YearsElectrified month daytype hour AnswerID_count total_hours valid_hours kw_mean kw_mean_diversity kw_std kw_std_diversity valid_obs_ratio
0 informal_settlement 1 1 Weekday 0 8 168 168.0 0.197733 0.120472 0.090666 0.072823 1.0
1 informal_settlement 1 1 Weekday 1 8 168 168.0 0.190615 0.117861 0.088619 0.077481 1.0
2 informal_settlement 1 1 Weekday 2 8 168 168.0 0.181404 0.113975 0.068271 0.047859 1.0
3 informal_settlement 1 1 Weekday 3 8 168 168.0 0.208552 0.133402 0.073348 0.057561 1.0
4 informal_settlement 1 1 Weekday 4 8 168 168.0 0.231796 0.115835 0.106534 0.048827 1.0

Explore Data Integrity

How good is my data?

The purpose of an uncertainty index is to assess data integrity. The key questions that the uncertainty index answers are:

  1. Do I have enough representative data?
  2. Is the data sufficiently reliable to construct a model with integrity?

The uncertainty index is calculated by establishing whether the sample size is sufficient to draw a conclusion about a certain characteristic feature of the model. In this system it is derived by selecting a valid model based on:

  • a specified minimum number of profiles observed
  • a specified minimum number of valid observations per model variable

The uncertainty index is the ratio of variables (rows) in the valid model to total variables. It is calculated as follows:

valid_submodel = submodel_input[(submodel_input.AnswerID_count >= min_answerid) &
                                (submodel_input.valid_obs_ratio >= min_obsratio)]
uix = len(valid_submodel) / len(submodel_input)

Moreover, for a model to be valid, it must share the same baseline as the benchmark model (e.g. same year, same region).
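The uncertainty index calculation above can be sketched on a toy input; the DataFrame below is a hypothetical stand-in for `submodel_input`, using the same thresholds set later in this notebook:

```python
import pandas as pd

# Toy submodel input: each row is one model variable (hypothetical values).
submodel_input = pd.DataFrame({
    'AnswerID_count':  [8, 3, 12, 2, 6],
    'valid_obs_ratio': [1.0, 0.9, 0.8, 0.95, 0.9],
})

min_answerid, min_obsratio = 4, 0.85

# Keep only variables that meet both thresholds, then take the row ratio.
valid_submodel = submodel_input[
    (submodel_input['AnswerID_count'] >= min_answerid)
    & (submodel_input['valid_obs_ratio'] >= min_obsratio)
]
uix = len(valid_submodel) / len(submodel_input)
print(uix)  # -> 0.4 (2 of 5 variables survive both thresholds)
```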

User Input

Set data integrity thresholds.

min_answerid = minimum number of observed Answer IDs in a subclass
min_obsratio = minimum ratio of valid observations in a subclass

Subclasses that do not meet these minimum requirements are discarded.


In [3]:
min_answerid = 4
min_obsratio = 0.85

In [6]:
ohp.loc[ohp.YearsElectrified==6].describe()


Out[6]:
YearsElectrified month hour AnswerID_count total_hours valid_hours kw_mean kw_mean_diversity kw_std kw_std_diversity valid_obs_ratio
count 864.0 864.000000 864.000000 864.000000 864.000000 864.000000 864.000000 720.000000 852.000000 720.000000 864.000000
mean 6.0 6.500000 11.500000 1.833333 18.062500 17.612269 0.610704 0.450651 0.383872 0.258780 0.954690
std 0.0 3.454052 6.926196 0.372894 15.464407 15.519700 0.440661 0.449071 0.240635 0.221398 0.146853
min 6.0 1.000000 0.000000 1.000000 4.000000 1.000000 0.051354 0.000147 0.004832 0.000040 0.250000
25% 6.0 3.750000 5.750000 2.000000 8.000000 8.000000 0.243508 0.071875 0.181358 0.067408 1.000000
50% 6.0 6.500000 11.500000 2.000000 9.000000 9.000000 0.538116 0.317255 0.393632 0.216758 1.000000
75% 6.0 9.250000 17.250000 2.000000 32.500000 32.500000 0.869120 0.683627 0.570940 0.397218 1.000000
max 6.0 12.000000 23.000000 2.000000 46.000000 46.000000 2.192869 2.033802 1.112752 1.100332 1.000000

In [7]:
stats = ec.uncertaintyStats(ods)
stats.loc['informal_settlement']


Out[7]:
AnswerID_count valid_obs_ratio
index
count 15.000000 15.000000
mean 13.266667 0.910585
std 19.436588 0.089651
min 2.000000 0.705574
25% 3.000000 0.893042
50% 5.000000 0.949663
75% 15.000000 0.970046
max 78.000000 0.976135

In [8]:
ep.plotAnswerIDCount(ohp)



In [9]:
ep.plotValidObsRatio(ohp, 'Weekday')
ep.plotValidObsRatio(ohp, 'Saturday')
ep.plotValidObsRatio(ohp, 'Sunday')



In [36]:
ods.name = 'demand_summary'
ohp.name = 'hourly_profiles'
validmodels = ec.dataIntegrity([ods, ohp], min_answerid, min_obsratio)
validmodels


Out[36]:
valid_data uncertainty_index valid_unit_count unit
submodel_name
demand_summary class YearsElectrified M... 0.394737 291 valid_AnswerID_count
hourly_profiles class YearsElectrified ... 0.361172 2.28236e+06 total_valid_hours

Explore Model Integrity

How well does my new model match my existing model?

User Input

Set up the model comparison.


In [17]:
customer_class = 'informal_settlement'
daytype = 'Weekday'
years_electrified = 8

Expert model subclass load profile


In [100]:
ep.plotHourlyProfiles(customer_class, 'expert', daytype, 10)


Corresponding data model subclass load profile


In [101]:
ep.plotHourlyProfiles(customer_class, 'data', daytype, 10,
                      model_dir=experiment_dir, data=ohp)


Visualising model similarity between old (expert) and new (data) models


In [37]:
#Get new valid submodels
valid_new_ds = validmodels.at['demand_summary','valid_data']
valid_new_hp = validmodels.at['hourly_profiles','valid_data']
new_dsts = 'M_kw_mean'
new_hpts = 'kw_mean'

#Get expert model
ex_ds, ex_hp, ex_dsts, ex_hpts = bm0.benchmarkModel()

#Calculate model similarity
euclid_ds, count_ds, merged_ds = ec.modelSimilarity(ex_ds, ex_dsts, valid_new_ds, new_dsts, 'ds')
euclid_hp, count_hp, merged_hp = ec.modelSimilarity(ex_hp, ex_hpts, valid_new_hp, new_hpts, 'hp')

#Visualise model similarity
ep.plotProfileSimilarity(merged_hp, customer_class, daytype) 
ep.plotDemandSimilarity(merged_ds)
ep.multiplotDemandSimilarity(merged_ds)
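A minimal sketch of what a measure like `ec.modelSimilarity` might compute, assuming it reduces to a Euclidean distance over matched expert/data values; the merged DataFrame and its column names below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical merged model: expert and data values for matched variables.
merged = pd.DataFrame({
    'kw_mean_expert': [0.20, 0.35, 0.50],
    'kw_mean_data':   [0.18, 0.40, 0.47],
})

# Euclidean distance over the matched points, plus the point count
# (mirroring the euclid_*, count_* pairs returned above).
diff = merged['kw_mean_expert'] - merged['kw_mean_data']
euclid = float(np.sqrt((diff ** 2).sum()))
count = len(merged)
print(euclid, count)
```

A smaller distance over a comparable point count indicates closer agreement between the old and new models, which is why the distance is reported alongside the number of data points compared.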



In [34]:
valid_new_hp.loc[(valid_new_hp['class']=='informal_settlement')&(valid_new_hp.YearsElectrified==15)].describe()


Out[34]:
YearsElectrified month hour AnswerID_count total_hours valid_hours kw_mean kw_mean_diversity kw_std kw_std_diversity valid_obs_ratio
count 857.0 857.000000 857.000000 857.000000 857.000000 857.000000 857.000000 857.000000 857.000000 857.000000 857.000000
mean 15.0 6.458576 11.428238 76.147025 760.122520 708.315053 0.424906 0.457494 0.233758 0.227243 0.932442
std 0.0 3.437149 6.906911 4.050712 617.751026 575.399042 0.177824 0.175335 0.097603 0.075029 0.020268
min 15.0 1.000000 0.000000 67.000000 268.000000 231.000000 0.161715 0.161205 0.053549 0.055520 0.861940
25% 15.0 3.000000 5.000000 78.000000 312.000000 290.000000 0.258576 0.332839 0.148611 0.175167 0.928315
50% 15.0 6.000000 11.000000 78.000000 390.000000 361.000000 0.418785 0.446462 0.246862 0.235227 0.935175
75% 15.0 9.000000 17.000000 78.000000 1537.000000 1369.000000 0.542743 0.572053 0.308626 0.282293 0.945513
max 15.0 12.000000 23.000000 78.000000 1794.000000 1661.000000 0.968354 1.046596 0.455984 0.410904 0.961538

In [72]:
ep.plotProfileSimilarity(merged_hp, 'informal_settlement', 'Weekday')



In [17]:
pd.DataFrame(data = [[euclid_ds, count_ds], [euclid_hp, count_hp]], index=['demand_summary','hourly_profiles'],
                  columns=['euclidean distance','data point count'])


Out[17]:
euclidean distance data point count
demand_summary 535.355975 21
hourly_profiles 31.285664 13943

Explore Abstraction Integrity

How well does my model represent reality?

Test class homogeneity

  • spread of magnitude of hourly demand for each customer class (min, max, 25%, 50%, 75%) - do one massive boxplot for all customer classes & all hours - show yearly or monthly view
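The spread statistics behind such a boxplot can be computed directly in pandas; the two-class demand sample below is hypothetical:

```python
import pandas as pd

# Hypothetical hourly mean demand samples for two customer classes.
df = pd.DataFrame({
    'class': ['informal_settlement'] * 4 + ['township'] * 4,
    'kw_mean': [0.20, 0.25, 0.30, 0.22, 0.50, 0.60, 0.55, 0.65],
})

# Per-class spread: the min, quartiles and max a boxplot would display.
spread = df.groupby('class')['kw_mean'].quantile([0, 0.25, 0.5, 0.75, 1.0]).unstack()
print(spread)
```

A homogeneous class shows a narrow spread; wide or overlapping spreads across classes suggest the class abstraction does not separate customers well.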

In [10]:
# NOTE: `md` is never defined above, so this cell originally raised
# `NameError: name 'md' is not defined`. It must be constructed first;
# one plausible (hypothetical) construction is the maximum mean demand
# per subclass:
md = ohp.groupby(['class', 'YearsElectrified', 'month'])['kw_mean'].max().reset_index()
ep.plotMaxDemandSpread(md)